Abstract

The ability of a robot to detect and respond to 
changes in its environment is potentially very use- 
ful, as it draws attention to new and potentially 
important features. We describe an algorithm for 
learning to filter out previously experienced stim- 
uli to allow further concentration on novel features. 
The algorithm uses a model of habituation, a bi- 
ological process which causes a decrement in re- 
sponse with repeated presentation. Experiments 
with a mobile robot are presented in which the 
robot detects the most novel stimulus and turns 
towards it ('neotaxis'). 



1 Introduction 

Many animals have the ability to detect novelty, 
that is to recognise new features or changes within 
their environment. This paper describes an algo- 
rithm which learns to ignore stimuli which are pre- 
sented repeatedly, so that novel stimuli stand out. 
A simple demonstration of the algorithm on an 
autonomous mobile robot is given. We term the 
robot's behaviour of following the most novel stim- 
ulus neotaxis, meaning 'turn towards new things', 
taken from the Greek (neo = new, taxis = follow). 
A number of different versions of the novelty filter 
are described and compared to find the best for the 
particular data used. 

Attending to more novel stimuli is a useful ability 
for a mobile robot as it can limit the amount of data 
which the robot has to process in order to deal with 
its environment. It can be used to recognise when 
perceptions are new and must therefore be learned. 
In addition, it means that the robot can be used 
as an inspection agent, so that after training to
learn common features it will highlight any 'novel' 
stimuli, i.e., those which it has not seen previously. 

1.1 Related Work 

A number of novelty detection methods have been
proposed within the neural network literature, but
they are mostly trained off-line. Particularly noteworthy
is the Kohonen Novelty Filter [?, ?],
which is an auto-encoder neural network trained
by back-propagation of error. After training, any
presentation to the network produces one of the
trained outputs, and the bitwise difference between
the input and output shows the novel parts of the
input. This work has been extended by a number of
authors. For example, Aeyels [?] adds a 'forgetting'
term into the equations.

Ho and Rouat [11] use a biologically inspired
model that times how long an oscillatory network
takes to converge to a stable output, reasoning that
previously seen inputs should converge faster than
novel ones. Finally, Levine and Prueitt [15] use the
gated dipole proposed by Grossberg [?, ?] to compare
inputs with pre-defined ones, novel features
causing greater output values.

2 The Novelty Filter 
2.1 Habituation 

Habituation is a reduction in behavioural response
that occurs when a stimulus is presented to an organism
repeatedly. It is present in many animals,
from the sea slug Aplysia [?] through toads [?, ?]
and cats [19] to humans [17]. It has been modelled
by Groves [?], Stanley [?] and Wang and
Hsu [23]. Habituation differs from other processes
which decrement synaptic efficacy, such as fatigue,
in that a change in stimulus restores the response
to its original levels. This process is called dishabituation.
There is also a 'forgetting' effect, where
a stimulus which has not been presented for a long
time recovers its response. Further details can be
found in [?, ?].

The habituation mechanism used in the system
described here is Stanley's model. The synaptic
efficacy, $y(t)$, decreases according to the following
equation:

$$\tau \frac{dy(t)}{dt} = \alpha [y_0 - y(t)] - S(t), \qquad (1)$$

where $y_0$ is the original value of $y$, $\tau$ and $\alpha$ are
time constants governing the rate of habituation
and recovery respectively, and $S(t)$ is the stimulus
presented. The effects of the equation are shown
in figure 1. The principal difference between this
and the model of Wang and Hsu is that the latter
allows for long-term memory, so repeated training
causes faster learning.
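The behaviour of equation (1) can be explored numerically. The following is a minimal sketch only, assuming a simple forward-Euler discretisation with a unit time step (the step size `dt` and the function name are our assumptions, not part of Stanley's model); the default constants are those quoted for series 1 of figure 1.

```python
def simulate_habituation(stimulus, tau=20.0, alpha=1.05, y0=1.0, dt=1.0):
    """Euler-integrate tau * dy/dt = alpha * (y0 - y) - S(t).

    `stimulus` is a sequence of S(t) values, one per time step.
    Returns the synaptic efficacy after each step.
    """
    y = y0
    trace = []
    for s in stimulus:
        y += (dt / tau) * (alpha * (y0 - y) - s)
        trace.append(y)
    return trace


# Stimulus present for 150 steps, removed for 50 steps, then presented again.
stimulus = [1.0] * 150 + [0.0] * 50 + [1.0] * 100
trace = simulate_habituation(stimulus)
```

With a constant stimulus the efficacy decays towards $y_0 - 1/\alpha$; once the stimulus is removed it recovers towards $y_0$, which is the 'forgetting' effect discussed in the text.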



[Plot: Habituation Training with Equation (1)]

Figure 1: An example of how the synaptic efficacy drops
when habituation occurs. In the first, descending part of
the graph, a stimulus $S(t) = 1$ is presented continuously.
This changes to $S(t) = 0$ at $t = 150$, where the synaptic
efficacy rises again, and becomes $S(t) = 1$ again at $t = 200$,
causing another drop. The two curves show different
values of the constants: in series 1 $\alpha = 1.05$ and in series
2 $\alpha = 1.2$. In both, $\tau = 20$ and $y_0 = 1.0$.

Figure 1 shows the synaptic efficacy increasing
again at time 150, when the stimulus is removed.
This is effectively a 'forgetting' effect, and is caused
by a dishabituation mechanism which increases the
strength of synapses that do not fire. In the implementation
described here this effect can be removed.
The experiments reported in section 4 investigate
the effects of the filter both with and without
forgetting.
Figure 2: The novelty filter. The input layer connects
to a clustering layer which represents the feature space, the 
winning neuron (i.e., the one 'closest' to the input) passing 
its output along a habituable synapse to the output neuron 
so that the output received from a neuron reduces with 
the number of times it fires. 



2.2 Using Habituation for a Novelty Filter

The principle behind the novelty filter is that per- 
ceptions are classified by some form of clustering 
network, whose output is modulated by habituable 
synapses, so that the more frequently a neuron fires, 
the lower the efficacy of the synapse becomes. This 
means that only novel features will produce any no- 
ticeable output. If the habituable synapses receive 
zero input (rather than none) during turns when 
their neuron does not fire, the synapses will 'for- 
get' the inhibition over time, providing that this 
forgetting mechanism (or dishabituation) is turned 
on. 

The choice of clustering algorithm is very important
and depends on the data being classified. In
this paper, we compare the performance of three
different networks, described below, on the robot
application. These three networks were
chosen because they performed best on sample data
selected to be similar to the data they would
see on the robot. In addition to those described
below, the Neural Gas [16] network also performed
well, but computational constraints meant that it
was not possible to run it on the robot.
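To make the principle concrete, the habituable synapses can be sketched independently of the clustering network. This is an illustrative sketch rather than the implementation used on the robot: it assumes one synapse per clustering neuron, a winner output of 1, an Euler step of equation (1), and the constants of figure 1. The class name and the `forgetting` flag are hypothetical.

```python
class HabituatingOutput:
    """One habituable synapse per clustering neuron (Euler step of Stanley's model)."""

    def __init__(self, n_neurons, tau=20.0, alpha=1.05, y0=1.0, forgetting=True):
        self.tau, self.alpha, self.y0 = tau, alpha, y0
        self.forgetting = forgetting
        self.y = [y0] * n_neurons  # all synapses start at full efficacy

    def step(self, winner):
        """Return the novelty output for this presentation, then habituate."""
        novelty = self.y[winner]  # winner's output scaled by its synapse
        for i in range(len(self.y)):
            if i == winner:
                s = 1.0    # the firing neuron drives its own synapse
            elif self.forgetting:
                s = 0.0    # zero input: efficacy recovers towards y0
            else:
                continue   # no input at all: efficacy stays frozen
            self.y[i] += (self.alpha * (self.y0 - self.y[i]) - s) / self.tau
        return novelty


novelty_filter = HabituatingOutput(n_neurons=12)
repeated = [novelty_filter.step(3) for _ in range(50)]  # same winner each time
fresh = novelty_filter.step(5)                          # a neuron that never fired
```

Here `repeated` falls steadily as neuron 3 keeps winning, while `fresh` is still at the initial efficacy, so only the unfamiliar input produces a noticeable output.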



2.3 Some Possible Clustering Networks

2.3.1 Kohonen's Self-Organising Map (SOM)

Kohonen's Self-Organising Map [13] works in the
following way:

Every element of the input vector is connected to
every node of the map by a modifiable connection.
The distance $d$ between the input and each of the
neurons in the field is calculated using

$$d = \sum_{i=0}^{N-1} [v_i(t) - w_i(t)]^2, \qquad (2)$$

where $v_i(t)$ is element $i$ of the $N$-element input vector
at time $t$ and $w_i(t)$ the weight between input $i$ and the
neuron. In a Learning Vector Quantiser [?], used here, the
neuron with the minimum $d$ is selected and the weights
for that neuron and its topological neighbours are
updated by:



$$w_i(t+1) = w_i(t) + \eta(t)\,[v_i(t) - w_i(t)], \qquad (3)$$

where $\eta$ is the learning rate, $0 < \eta < 1$.

Usually, a two-dimensional SOM is used, but in 
the implementation described here a ring-shaped 
network, effectively a line with the end neurons 
linked together, was used. The neighbourhood size 
and learning rate remained constant so that the 
system was always learning. The neighbourhood 
comprised only the nearest neighbours of each neuron,
and $\eta$ was fixed at 0.25.
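One learning step of the ring-shaped map can be sketched as follows. This is a minimal illustration under the assumptions stated in the text (nearest-neighbour neighbourhood, constant learning rate of 0.25); the function names and the evenly spaced initial weights are ours and purely illustrative.

```python
def squared_distance(v, w):
    """Equation (2): sum of squared differences between input and weights."""
    return sum((vi - wi) ** 2 for vi, wi in zip(v, w))


def som_ring_step(weights, v, eta=0.25):
    """Find the winning neuron and update it and its two ring neighbours.

    `weights` is a list of weight vectors, one per neuron, modified in place.
    Returns the index of the winner.
    """
    n = len(weights)
    winner = min(range(n), key=lambda j: squared_distance(v, weights[j]))
    # On a ring the neighbours wrap around at both ends.
    for j in {(winner - 1) % n, winner, (winner + 1) % n}:
        weights[j] = [w + eta * (x - w) for w, x in zip(weights[j], v)]
    return winner


# Twelve neurons with six-element weight vectors (illustrative initial values).
weights = [[j / 12.0] * 6 for j in range(12)]
winner = som_ring_step(weights, [0.5] * 6)
```

Because neither the neighbourhood nor the learning rate shrinks, repeated calls keep adapting the map, which matches the 'always learning' behaviour described above.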

2.3.2 The Temporal Kohonen Map (TKM) 

This self-organising map, proposed by Chappell
and Taylor [?], is based on Kohonen's SOM, but
uses "leaky integrator" neurons whose activity decays
exponentially over time. The exponential decay
is controlled by a time constant ($\gamma$ in equations
4 and 5 below). This is similar to a short-term
memory, allowing previous inputs to have some effect
on the processing of the current input, so that
the neurons which have won recently are more likely
to win again. In the experiments reported here the
value $\gamma = 0.4$ was used, meaning that only the previous
2 or 3 winners had any influence in deciding
the current winner. The activity $a_i$ of neuron $i$,
with weight vector $w_i$, is calculated using

$$a_i(t) = \gamma\, a_i(t-1) + e^{-\frac{1}{2}\|v(t) - w_i(t)\|^2}, \qquad (4)$$

and, in a similar way to the SOM, the neuron with
the largest activity $a_i$ is chosen as winner, and its
weights and those of its topological neighbours are
updated using the following weight update rule ($\eta$ and
the neighbourhood remained the same):

$$w_i(t+1) = w_i(t) + \eta \sum_{k} \gamma^{k}\,[v(t-k) - w_i(t-k)]. \qquad (5)$$
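The leaky-integrator activity and the history-weighted update can be sketched as below. This is a hedged illustration, not the robot code: it assumes that the sum over past errors in equation (5) is accumulated recursively for every neuron (each step multiplies the stored error by $\gamma$ and adds the current error), and the class and attribute names are hypothetical.

```python
import math


class TemporalKohonenMap:
    """Ring of leaky-integrator neurons (sketch of equations 4 and 5)."""

    def __init__(self, weights, gamma=0.4, eta=0.25):
        self.w = [list(w) for w in weights]
        self.gamma, self.eta = gamma, eta
        self.a = [0.0] * len(self.w)                 # leaky activities
        self.e = [[0.0] * len(w) for w in self.w]    # decayed error history

    def step(self, v):
        n = len(self.w)
        for i in range(n):
            diff = [x - w for x, w in zip(v, self.w[i])]
            sq = sum(d * d for d in diff)
            # Equation (4): decayed previous activity plus current match.
            self.a[i] = self.gamma * self.a[i] + math.exp(-0.5 * sq)
            # Recursive form of the sum of gamma^k * [v(t-k) - w_i(t-k)].
            self.e[i] = [d + self.gamma * e for d, e in zip(diff, self.e[i])]
        winner = max(range(n), key=lambda i: self.a[i])
        # Equation (5) applied to the winner and its ring neighbours.
        for i in {(winner - 1) % n, winner, (winner + 1) % n}:
            self.w[i] = [w + self.eta * e for w, e in zip(self.w[i], self.e[i])]
        return winner


tkm = TemporalKohonenMap([[float(j)] for j in range(12)])
first = tkm.step([6.0])
second = tkm.step([6.0])
```

Since the map carries its own history in `a` and `e`, it only needs the most recent reading at each step rather than a window of past readings.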
2.3.3 The K-Means Clustering Algorithm

One of the simplest ways to cluster data is by using
the K-means algorithm [?]. A pre-determined
number of prototypes, $\mu_j$, are chosen to represent
the data, so that it is partitioned into $K$ clusters.
The positions of the prototypes are chosen to minimise
the sum-of-squares clustering function,

$$J = \sum_{j=1}^{K} \sum_{n \in S_j} \|x_n - \mu_j\|^2, \qquad (6)$$

for data points $x_n$. This separates the data into
$K$ partitions $S_j$. The algorithm can be carried out
as an on-line or batch procedure, with the on-line
version, used here, having the update rule

$$\Delta \mu_j = \eta\,(x_n - \mu_j), \qquad (7)$$

where $\mu_j$ is the prototype closest to $x_n$.
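As an illustration of the on-line procedure, the sketch below uses the common form in which the prototype nearest to each data point is moved a fraction of the way towards it. The learning rate of 0.25 and the function names are assumptions made for the example, and the second function simply evaluates the sum-of-squares clustering function for a data set.

```python
def nearest_prototype(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda j: sum((a - m) ** 2 for a, m in zip(x, prototypes[j])),
    )


def kmeans_online_step(prototypes, x, eta=0.25):
    """Move the nearest prototype a fraction eta towards the data point x."""
    j = nearest_prototype(prototypes, x)
    prototypes[j] = [m + eta * (a - m) for m, a in zip(prototypes[j], x)]
    return j


def sum_of_squares(prototypes, data):
    """Sum-of-squares error with each point assigned to its nearest prototype."""
    total = 0.0
    for x in data:
        j = nearest_prototype(prototypes, x)
        total += sum((a - m) ** 2 for a, m in zip(x, prototypes[j]))
    return total


prototypes = [[0.0, 0.0], [10.0, 10.0]]
data = [[2.0, 0.0]]
error_before = sum_of_squares(prototypes, data)
chosen = kmeans_online_step(prototypes, data[0])
error_after = sum_of_squares(prototypes, data)
```

Each step reduces the distance between the presented point and its prototype, so the error for that point falls, as the before and after values show.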



3 Using the Novelty Filter on a Mobile Robot

The robot implementation was designed to show 
that the novelty filter described in section 2.2 can 
be used to detect new stimuli. The novelty fil- 
ter was incorporated into a system where a robot 
detects and turns towards new stimuli. It was 
implemented on a Fischer Technik mobile robot, 
which uses a Motorola 68HC11 microcontroller. 
The robot has a two wheel differential drive sys- 
tem and four light sensors facing in the cardinal 
directions. 

In the experiments described below, the robot 
received a number of different light stimuli, which 
varied in the frequency of the flashes. It classified 
these stimuli autonomously and decided whether or 
not to respond (turn towards the source) according 
to how novel they were. Each of the sensors on 
the robot, in this case four light sensors, had its 
own novelty filter, as shown in figure 4. At each
cycle, the current reading on each sensor was concatenated
with the previous five to form a six-element
input vector, known as a delay line or lag
vector. This vector was classified by the novelty
filter and an output produced. In the case of the
TKM, which keeps an internal history of previous
inputs, only the most recent reading was needed as
input.

Figure 3: The Fischer Technik Robot used in the
experiments. The light sensors used in the experiments
can be seen at the top of the mast towards the front of
the robot.

Figure 4: The overall system for choosing the most interesting stimulus. Each sensory perception is classified separately
by a novelty filter (which receives an input of present and recent perceptions) and a value indicating the novelty of that
stimulus is output. Completely new stimuli are given a higher priority. The most novel stimulus was selected for a response,
provided it exceeded a pre-defined threshold.
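The delay line described above can be sketched with a fixed-length buffer. The class name, the oldest-first ordering and the initial fill value of zero are assumptions made for illustration; the text does not say how the first five cycles were handled.

```python
from collections import deque


class DelayLine:
    """Keeps the current sensor reading and the previous five."""

    def __init__(self, length=6, fill=0.0):
        # A bounded deque discards the oldest reading automatically.
        self.buffer = deque([fill] * length, maxlen=length)

    def push(self, reading):
        """Add the newest reading and return the lag vector (oldest first)."""
        self.buffer.append(reading)
        return list(self.buffer)


line = DelayLine()
for reading in [1, 2, 3, 4, 5, 6]:
    lag_vector = line.push(reading)
next_vector = line.push(7)
```

One such delay line would be kept per sensor, with each lag vector passed to that sensor's novelty filter.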

The output of the filter was a function of how
many times that neuron had fired before, due to
the habituating synapse. Each of the four novelty
filters fed its output to a comparator function
which propagated the strongest signal, provided
that it was above a pre-defined threshold, to the action
mechanism. If none of the stimuli was strong
enough, the cycle repeated. Owing to memory constraints,
the clustering mechanism was limited to
just twelve neurons arranged in a ring. All three of
the networks described in section 2.3 were the same
size.

A bypass function was associated with each sen- 
sor. If a neuron had not fired before (that is, its 
synapse had not been habituated) the comparator 
function favoured it, so that the system responded 
rapidly to new signals. If two new signals were de- 
tected simultaneously, the stronger one was used. 
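The comparator and bypass logic might be sketched as follows. This is one possible reading of the description rather than the code that ran on the robot: it assumes that new signals are preferred over familiar ones, that the threshold is still applied to whichever signal is chosen, and that the threshold is the 0.4 boredom threshold quoted in section 4. The function and argument names are hypothetical.

```python
def select_stimulus(outputs, never_fired, threshold=0.4):
    """Pick the sensor to respond to, or None if nothing is novel enough.

    outputs      -- novelty filter output for each sensor
    never_fired  -- True where the winning neuron had not fired before
    """
    new = [i for i, flag in enumerate(never_fired) if flag]
    # Bypass: if any sensor sees a brand-new signal, only those compete.
    candidates = new if new else list(range(len(outputs)))
    best = max(candidates, key=lambda i: outputs[i])
    if outputs[best] > threshold:
        return best
    return None  # nothing strong enough: the cycle simply repeats


familiar = select_stimulus([0.9, 0.5, 0.2, 0.1], [False] * 4)
bored = select_stimulus([0.3, 0.2, 0.1, 0.1], [False] * 4)
bypassed = select_stimulus([0.9, 0.5, 0.6, 0.1], [False, True, True, False])
```

In the third call sensor 0 has the strongest output, but sensors 1 and 2 carry new signals, so the stronger of those two is selected.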

4 Experiments and Results 

Three separate experiments were carried out. The
first, the results of which are shown in figure 5 and
table 1, was designed to test the forgetting mechanism
as well as the general ability to turn towards
novel stimuli. The robot was initially placed in a
featureless environment. A light was introduced
to project onto one of the light sensors. Once the
robot had turned to face this light source, a second,
slowly flashing light was added. As this light was
more novel, the robot turned towards it. A further,
faster flashing light was then introduced, which the
robot again faced. Finally, the constant light was
switched off and, in the case where a 'forgetting'
mechanism was used, the robot perceived this lack
of stimulus as novel and turned back towards it.
Otherwise it did not respond.

In the second experiment, steps (a) and (b) of
figure 5 were again followed. However, instead of a
faster flash being shown in the third stage, a sec- 
ond flashing light of the same (slow) frequency was 
shown. If the flashing light was still novel, the robot 
turned towards this as it was a newer version of the 
most novel stimulus. However, if the flashing light 
had ceased to be novel, the robot ignored it. 

Finally, in the third experiment, instead of a second
flashing light in part (c), a second constant light was
introduced. Whether or not the robot responded to this
depended on whether the forgetting mechanism
was switched on and on which sensor the light fell: if it
fell on a sensor which had not previously seen it, the
robot responded.

Table 1 shows the reactions of the robot in the
three experiments, both with forgetting turned on
and off. The constants used for the experiments
were: $\tau = 0.1$, $\alpha = 0.5$, $\beta = 0.1$ and a boredom
threshold (i.e., the value below which a stimulus
ceased to be novel) of 0.4. The parameters of the
networks were kept at the levels found to be optimal
in simulations. The overall qualitative results were
the same for all three networks, although the SOM
took longer to produce consistent output when a
new pattern was introduced (owing to the changes
in the spatial pattern in the lag vector) while the
TKM responded to them quickly.

In table 1 it can be seen that particular inputs
caused the robot to move even when the stimulus
had been seen before. This occurred because the
stimulus was on a sensor which had not perceived
it previously. This meant that the robot's attention
was changing unnecessarily, so a method to rectify
this was devised. When a stimulus is marked as
novel the robot rotates through 360°, pausing every
90°, so that each of the novelty filters learns
to recognise all the stimuli. This means that the
robot reacts to stimuli in the same way regardless
of which sensor they impinge on. This functionality
can be produced in other ways, such as using
one novelty filter to monitor all the sensors and
adding additional memory of what each sensor was
seeing to turn the robot in the appropriate direction.
The output of the network took a few iterations
to stabilise for each new input, and the SOM
in particular occasionally generated spurious readings,
caused by misreading the signals so that the
input vector varied. This was usually because the
sensor polling could not be precisely timed, so that
occasionally the time between readings varied and
so an unexpected input was received.

Figure 5: Figures showing the behaviour of the
robot during the four stages of the first experiment with
forgetting. The motion of the robot is shown using the
dotted lines. In (a) the robot turns towards the new light,
in (b) it turns towards the newer flashing light, and then
in (c) to the faster flashing light. Finally in (d) it turns
back to the point where the light has been turned off.

Experiment  Forgetting  Stage             Action
1           On          Constant On       Robot turns towards it
1           On          Slow Flashing On  Robot turns towards it
1           On          Fast Flashing On  Robot turns towards it
1           On          Constant Off      Robot turns towards it
1           Off         Constant On       Robot turns towards it
1           Off         Slow Flashing On  Robot turns towards it
1           Off         Fast Flashing On  Robot turns towards it
1           Off         Constant Off      Robot does not respond
2           On          Constant On       Robot turns towards it
2           On          Slow Flashing On  Robot turns towards it
2           On          Slow Flashing On  If on a different sensor, robot turns towards it
2           Off         Constant On       Robot turns towards it
2           Off         Slow Flashing On  Robot turns towards it
2           Off         Slow Flashing On  If on a different sensor, robot turns towards it
3           On          Constant On       Robot turns towards it
3           On          Slow Flashing On  Robot turns towards it
3           On          Constant On       If on a different sensor, robot turns towards it
3           Off         Constant On       Robot turns towards it
3           Off         Slow Flashing On  Robot turns towards it
3           Off         Constant On       If on a different sensor, robot turns towards it

Table 1: A description of the robot's behaviour in the first series of experiments.

4.1 Further Experiments 

In the experiments described previously, all three
clustering networks showed similar qualitative results.
For this reason, further tests were designed
to try to discriminate between the networks.
The additional experiments used lights which
flashed at varying speeds. The neotaxis behaviour
of the robot remained fixed. Two additional patterns
of flashing lights were used, short-short-long-long and
short-long-short-long, which the K-Means network and
Temporal Kohonen Map both recognised more accurately
than the SOM. The TKM in particular
dealt with all the stimuli very well, but the SOM
was occasionally subject to errors and took longer
to respond. The number of patterns which the robot
can learn and recognise is limited
by the size of the network.



5 Conclusions and Future Work

The mechanism described here is capable of recog- 
nising features which vary in time and habituat- 
ing to those that are seen repeatedly. In this way 
it successfully acts as a novelty filter, highlighting 
those stimuli which are new and directing attention 
towards them. This is a useful ability, since it can 
reduce the amount of data which the robot needs to 
process in order to deal with its environment. How- 
ever, in the application described here, the inputs 
are fairly clean, the environment being designed to 
produce differentiable inputs. 

One of the assumptions that is made in this paper 
is that the clustering networks used will reliably 
separate the inputs so that new stimuli cause a new 
neuron to win, and old stimuli activate the same 
neuron each time. This is not necessarily true, and 
the potential problems this highlights need to be 
investigated. Using a growing network such as the
Growing Neural Gas of Fritzke [6] is one solution,
as is using a Mixture of Experts [?] in place of
the clustering network, each expert recognising a
different part of the input space.

In addition, the sensors used here, photocells, are
crude and do not give a great deal of information,
and the robot has very limited memory. To produce
a system which is capable of interacting with real
world environments it will be necessary to use more
and better sensors. The next step will be to transfer
the system onto the Manchester Nomad 200 robot,
Forty Two, and take advantage of the sensors available,
viz. sonar, infra-red and a monochrome CCD
camera. Before the novelty filter can deal with this
information, sensor inputs will have to be extensively
preprocessed, with features extracted from
the images. Work using sonar scans taken whilst
the robot is exploring an environment has shown
success in applying the novelty filter to a real world
problem (work to be published).

However, once data about the surrounding envi- 
ronment can be interpreted, the novelty filter pre- 
sented here can be used in an inspection agent 
which learns a representation of an environment 
and can then explore and detect new or changed 
features within both that and similar environments. 
This is the ultimate aim of this research. 